Protein Fold Recognition with K-Local Hyperplane Distance Nearest Neighbor Algorithm

Author

  • Oleg Okun
Abstract

This paper deals with protein structure analysis, which is useful for understanding the function of proteins and therefore their evolutionary relationships, since for proteins function follows from form (shape). One of the basic approaches to structure analysis is protein fold recognition (a protein fold is a 3-D pattern), which is applied when there is no significant sequence similarity between structurally similar proteins. Fold recognition does not rely on sequence similarity and can be achieved with relevant features extracted from protein sequences. Given (numerical) features, one of the existing machine learning techniques can then be applied to learn and classify proteins represented by these features. In this paper, we experiment with the K-Local Hyperplane Distance Nearest Neighbor algorithm (HKNN) [12] applied to protein fold recognition. The goal is to compare it with other methods tested on a real-world dataset [3]. Two tasks are considered: 1) classification into four structural classes of proteins and 2) classification into the 27 most populated protein folds composing these structural classes. Preliminary results demonstrate that HKNN can successfully compete with other methods (in both speed and accuracy) and thus encourage its further exploration.
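The abstract names HKNN but does not spell out the decision rule. The following is a minimal sketch of the rule as it is commonly described: for each class, take the K training points of that class nearest to the query, fit a local hyperplane through them, and assign the class whose hyperplane lies closest. It is an illustration under assumptions, not the paper's implementation. The function name hknn_predict, the parameter names k and lam, and the ridge-style penalty lam on the hyperplane coefficients are choices made here for the example.

```python
import numpy as np

def hknn_predict(X_train, y_train, x, k=3, lam=1.0):
    """Classify one query point x with a local-hyperplane nearest-neighbor rule.

    Hypothetical helper for illustration; k and lam are assumed parameters.
    lam must be > 0 so that the small linear system below is solvable.
    """
    best_label, best_dist = None, np.inf
    for label in np.unique(y_train):
        Xc = X_train[y_train == label]
        # K nearest neighbors of x, taken within this class only
        order = np.argsort(np.linalg.norm(Xc - x, axis=1))
        nbrs = Xc[order[:k]]
        # Local hyperplane: passes through the neighbors' centroid and is
        # spanned by the centered neighbors (stored as columns of V)
        centroid = nbrs.mean(axis=0)
        V = (nbrs - centroid).T
        r = x - centroid
        # Penalized least squares for the coefficients of the closest point
        alpha = np.linalg.solve(V.T @ V + lam * np.eye(V.shape[1]), V.T @ r)
        # Penalized squared distance from x to the local hyperplane
        dist = np.sum((r - V @ alpha) ** 2) + lam * np.sum(alpha ** 2)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Toy data: two classes lying along two parallel lines
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
              [0.0, 3.0], [1.0, 3.0], [2.0, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])
pred_low = hknn_predict(X, y, np.array([1.0, 0.5]), k=2, lam=0.1)
pred_high = hknn_predict(X, y, np.array([1.0, 2.9]), k=2, lam=0.1)
```

In the fold recognition setting, each row of X_train would hold the numerical features extracted from one protein sequence, and y_train would hold the structural class or fold label.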


Similar resources

K-Local Hyperplane Distance Nearest-Neighbor Algorithm and Protein Fold Recognition

Two proteins may be structurally similar but not have significant sequence similarity. Protein fold recognition is an approach usually applied in this case. It does not rely on sequence similarity and can be achieved with relevant features extracted from protein sequences. In this paper, we experiment with the K-local hyperplane distance nearest-neighbor algorithm [8] applied to the protein fo...


Diagnosis of Breast Cancer Tissues Using 785 nm Miniature Raman Spectrometer and Pattern Regression

To develop a portable, low-cost, in vivo cancer diagnosis instrument, a 785 nm laser miniature Raman spectrometer was used in this paper to acquire Raman spectra for breast cancer detection. However, because of the low spectral signal-to-noise ratio, it is difficult to achieve high discrimination accuracy with the miniature Raman spectrometer. Therefore, a patte...


Classification by ALH-Fast Algorithm

The adaptive local hyperplane (ALH) algorithm is a very recently proposed classifier, which has been shown to perform better than many other benchmarking classifiers including support vector machine (SVM), K-nearest neighbor (KNN), linear discriminant analysis (LDA), and K-local hyperplane distance nearest neighbor (HKNN) algorithms. Although the ALH algorithm is well formulated and despite the...


Feature Normalization and Selection for Protein Fold Recognition

A protein is an amino acid sequence. To determine protein function, which is important in understanding evolutionary relationships, fold recognition is one of the promising techniques to apply, especially when protein sequence identity is below 50%, so that no reliable classification can be obtained from a sequence comparison. Fold recognition is the analysis of proteins based on structure rather...


An affinity-based new local distance function and similarity measure for kNN algorithm

In this paper, we propose a modified version of the k-nearest neighbor (kNN) algorithm. We first introduce a new affinity function for distance measure between a test point and a training point which is an approach based on local learning. A new similarity function using this affinity function is proposed next for the classification of the test patterns. The widely used convention of k, i.e., k...



Publication date: 2004